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Abstract 

Detailed balance is an overly strict condition to ensure a valid Monte Carlo 
simulation. We show that, under fairly general assumptions, a Monte Carlo 
simulation need satisfy only the weaker balance condition. Not only does our 
proof show that sequential updating schemes are correct, but also it establishes 
the correctness of a whole class of new methods that simply leave the Boltzmann 
distribution invariant. 
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The theoretical basis for Monte Carlo simulation is Markov chain theory. In a Monte 
Carlo simulation, the configuration of a system of interest is updated by a series of moves, and 
the moves are chosen so that the updating process forms a Markov chain. Let us consider, 
without loss of generality, that the system is in the canonical ensemble. We then desire that 
the configurations generated by the Markov chain sample the Boltzmann distribution, after 
an initial transient, "equilibration" period. We would like, in other words, that the limiting 
distribution of the Markov chain exists, is unique, and is the Boltzmann distribution. This 
result is assured if the Markov chain is regular and satisfies the detailed balance condition.0 
The proof of this result uses the Perron-Frobenius theorem and the fact that a matrix 
obeying detailed balance has a complete set of eigenvectors. 

There are many Monte Carlo methods in wide use, however, that do not satisfy detailed 
balance. The classic example is the sequential updating method.! In such a method, the 
reverse move of sequentially updating in the reverse order never occurs, and so the method 
cannot satisfy detailed balance. The correctness of these simulations cannot be justified 
with the standard proof, which relies on the detailed balance condition. In fact, there 
appears to be no proof available in the literature that would assure us that these methods 
are correct. Yet, they are in wide use. Many specific examples can be given. One recent 
example is sequential updating of spins in the Ising model.! Another is sequential updating 
of surface patches in membrane simulations.! Of interest to the chemistry and chemical 
engineering community is the original implementation of the Gibbs ensemble, where particle 
displacements, volume changes, and particle exchanges were carried out in a fixed order.! 
All implementations to date of parallel tempering, a powerful method for sampling "glassy" 
systems, perform a fixed sequence of particle moves and system swaps. Finally, several 
hybrid methods that combine Monte Carlo and molecular dynamics in a sequential fashion 
have been introducediil A variety of more general methods have been introduced that are 
guaranteed only to leave the Boltzmann distribution invariant, not to converge to it .0 Given 
the lack of a rigorous proof of correctness of these methods and the general belief in the 
importance of detailed balance, these methods have been criticized by various authors. 

In this short communication, we show that detailed balance is too strict a condition. 
In particular, we show that the weaker balance condition is sufficient and necessary. The 
balance condition simply requires that the Markov chain leave the Boltzmann distribution 
invariant. To be precise, we show that a set of Monte Carlo moves is valid if they 1) lead to 
Markov sampling, 2) lead to regular sampling, and 3) satisfy the balance condition. 

The organization of our paper is as follows. We first formulate in matrix notation the 
master equation that governs the Markov process. Next, we state precisely our theorem. 
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We then prove the theorem, first in a heuristic way and then in a rigorous, formal fashion. 
We conclude with a discussion of the implications of this theorem. 

A Monte Carlo simulation can be understood as an implementation of a discrete-time 
master equation. The state of the system at n Monte Carlo steps (MCS) is related to that 
at n — 1 MCS by a transition matrix: 

X n = AX„_! , (1) 

where the probability that the system is in state % at n MCS is given by the i-th component 
of the vector vector x n , and A is the transition matrix. In this formulation, the sequence of 
steps forms a Markov chain. For a method that satisfies detailed balance, each step can be 
one of the many possible moves, chosen at random. For a sequential updating procedure, 
each step will be an entire sweep through a sequential series of moves. 

The transition matrix satisfies certain obvious properties. The first is that the matrix is 
non-negative: 

Aij>0 Vi, Vj, (2) 

which ensures that the probabilities remain positive. The second obvious condition is that 
the transpose of the matrix is stochastic: 

E^' = 1 Vj, (3) 

i 

which ensures that the transition matrix conserves total probability. It is well known that 
the set of Monte Carlo moves must lead to ergodic sampling, otherwise the system will not 
sample all of phase space. In matrix notation, this is expressed as [-A m ]jj > for all % and 
j, for some to, which may depend on % and j. We will need the slightly stronger condition 
that the set of moves leads to regular sampling. This condition is expressed as 

[A m ]ij > for all i,j for some fixed to . (4) 

Note that a set of moves that leads to ergodic sampling can be converted into a set of moves 
that leads to regular sampling with the addition of some null moves: 

A' = (1 -a)A + aI, (0 < a < 1) , (5) 

where I is the identity matrix. Finally, if the desired distribution, i.e. the Boltzmann 
distribution, is denoted by x*, the balance condition is simply that 

Ax* = x* . (6) 
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We note that the detailed balance condition is expressed in matrix notation as AijX* = AjiX*. 
We further note that if matrix A satisfies the balance condition, then the matrix A' in Eq. 
(0) also satisfies the same balance condition. 

Our theorem is that a set of Markov moves satisfying conditions (|||-[D always converges 
to a limiting distribution, which is unique. It follows that if by some means we know that 
the moves satisfy the balance condition (|6|), then the simulation will eventually converge to 
sampling the desired distribution x*. 

We first present a simple proof of this theorem. The Perron theorem implies that the 
matrix A m has a single, unique maximum eigenvalue, A, with an associated eigenvector, 
x*, with non-negative entries (Theorems 1.4.4 and 1.8.1 of Ref. plj). By taking a sum 
over components in the equation A m x* = Ax*, we conclude that A = 1. The limiting 
distribution, if it exists, is given by lim^oo A mn x , where x is the initial condition. Since the 
sampling is regular, this limit cannot depend on the initial condition, because essentially all 
initial conditions are sampled during the initial equilibration period. By choosing an initial 
condition of x = x*, we conclude that the limiting distribution of A m is x*. Since the initial 
condition is irrelevant, we can equally well consider limits such as lim^oo v4 mn L4 fc x ], which 
simply correspond to different choices for initial conditions. But these limits all must be x* 
for any k by the same argument. That all of these limits are equal implies that 

lim A n x = x* , (7) 

n^oo 

where the m has been eliminated, and the theorem is proved. 

We now present an alternative and more formal proof of the theorem. From conditions (0) 
and @ and Lemma 3.1.1 of Ref. [15] we conclude that the spectral radius of A is unity. From 
conditions (0)-(fD an d the Perron- Frobenius theorem (Theorem 1.4.4 of Ref. [15]) we conclude 
that 1 is an eigenvalue of A and that the components of the corresponding eigenvector are 
non-negative. We denote this eigenvector by w^ 1 ). From condition (|J) we can, in fact, reach 
the stronger conclusion that all other eigenvalues of A have a modulus strictly less than 
unity (Theorem 1.8.1 of Ref. ]15|). Page 40 of Ref. [15] then states that 



lim A n 

n— >oo 



lim 



(X-1)(XI-A) 



-i 



(8) 



We now proceed to evaluate the limit in Eq. (|S|). We first decompose the / x / matrix A 
into its Jordan normal form:0 A = \VJW~ 1 , where 
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Here Aj are the eigenvalues of A, and bi are either or 1 depending on the degrees of the 
elementary divisors. We find that 
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(10) 



Carrying out the inversion, we find 
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Now since Ai = 1 is the only eigenvalue of unit modulus, the associated elementary divisor 
has degree 1, and &i = 0. We then see that 
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(A-1)(A/-J)- 1 



1 ... 
... 



ee 



(12) 



where e\ = 1, and = otherwise. We denote the columns of W by w^. Since W is 
by definition a non-singular matrixjll the vectors form a complete basis, and we can 



decompose the initial condition as x 
we have 



EL«; wW = Wa. From Eqs. (§, (|T0|), and (0 
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(13) 



where in the last step we have used Eq. (||]) and the fact that both xo and w^ 1 ) are normalized 
probability distributions. This completes the rigorous proof of the theorem. 

The implications of our theorem are significant. First, the sequential updating approach, 
widely practiced and nearly as widely criticized, is correct. All that is required is that the 
sweep leave the Boltzmann distribution invariant, in addition to leading to regular sampling. 
Indeed, there are numerical reports that the sequential updating approach leads not only to 
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correct sampling but also to more efficient sampling in a variety of contexts, ranging from 
image processing to membrane physics.^ There are some simple cases, however, for which 
random Metropolis updating is ergodic and regular, but sequential Metropolis updating is 
not. One is the Ising model at infinite temperature, where the configurations produced by 
sequential updating simply oscillate between the initial configuration and the configuration 
with all spins flipped. Another example is the 2x2 Ising model at any temperature. If the 
spins are updated in the order ^ ^ , then the initial configurations and _ 

lead to sequences that also oscillate between spin-flipped configurations. In these exam- 
ples, our theorem does not guarantee convergence to the Boltzmann distribution, because 
assumption (^) is violated. If, however, the individual spin-flip moves within the sweep 
were performed with a probability 1 — a, as in Eq. (^), then convergence to the Boltzmann 
distribution would occur in both of these examples. 

Note that our theorem requires the sampling to be regular, not simply ergodic. Indeed, 
a method that is ergodic may not converge to the invariant distribution, even if it satisfies 
detailed balance. A simple example is provided by a two-state system. The transition matrix 
j q leads to ergodic, but not regular, sampling. This transition matrix satisfies detailed 
balance with the invariant distribution x\ = x\ = 1/2. The limit lim n _ J . tX3 A n xo does not 
exist for a general initial condition, however, because A u xq simply alternates between xq 
and the swapped state for even and odd n. 

There has been concern that perhaps one should not use the data obtained from within 
the sweep of a sequential move, or that perhaps the statistics collected would depend on how 
the sweep is defined. In fact, we can show that all of the data can be used, and the results do 
not depend on how the sweep is defined. For simplicity, let us consider a sweep composed of 
iV steps: A = llfeLi E>( k \ Our general theorem says that the data collected via the procedure 
x„ = A n xo will sample the Boltzmann distribution. Moreover, after the initial equilibration 
period, the initial condition xo will not affect the results. We can, therefore, equally well 
consider the exact same sequence of moves but collect data at a shifted period within the 
sweep: x„ = [B^ II^lTi 1 B^'^B^Xq]. This approach, which uses the data within the 
sweep, is also susceptible to our general theorem. We can generalize this argument to any 
arbitrary shifting of the data collection period. By our theorem, all of the data so collected 
will be sampling from the Boltzmann distribution once the equilibration regime has passed. 
The data from within the sweep, therefore, can be used, and the average statistics will not 
depend on which data are used. 

We note that a simple way to ensure that a sequential sweep satisfy the balance condition 
is to require that each of the moves within the sweep satisfy local detailed balance: B^x* = 
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B^x*. Indeed, the local detailed balance condition is satisfied in all of the sequential Monte 
Carlo methods previously cited. By using this condition, one can show that the balance 



condition, Eq. ([]), is satisfied: 
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Here we have used local detailed balance condition to achieve the third line and the fact that 
the transpose of each B^ is stochastic to achieve the fourth line. The final line is achieved 
by induction. 

More generally, our result implies that any Monte Carlo method that leaves the Boltz- 
mann distribution invariant is correct as long as it leads to regular, Markov sampling. The 
type-R and type-M transitions within the dynamic weighting method of Wong and Liang, 
for example, are known only to leave the Boltzmann distribution invariant.0 Although the 
method led to efficient and apparently accurate sampling, the formal correctness of this 
method has not been shown. Our general theorem formally establishes the validity of this 
and other such methods. 

In summary, we have clarified a long-standing and controversial issue in Monte Carlo 
simulations by establishing the balance condition as a necessary and sufficient fundamental 
requirement. This condition is substantially weaker than the detailed balance condition. 
Although our proof of convergence demands that the set of Monte Carlo moves lead to 
regular sampling, this is the case for almost all sets of moves that lead to ergodic sampling. 
We have shown that the local detailed balance condition is one means of constructing a 
sequential Monte Carlo scheme satisfying the balance condition. 
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